Administrative data ICD-10 diagnostic codes identify most lab-confirmed SARS-CoV-2 admissions but miss many discharged from the Emergency Department

We estimated the operating characteristics of ICD-10 code U07.1, introduced by the World Health Organization in 2020, to identify lab-confirmed SARS-CoV-2. CCEDRRN is a national research registry of adults (March 2020–August 2021) with suspected/confirmed SARS-CoV-2 identified in Canadian emergency departments (EDs) using chart review (symptoms, clinical information, and lab test results including SARS-CoV-2 polymerase chain reaction, PCR results). CCEDRRN data were linked to administrative hospitalization discharge and ED ICD-10 diagnostic codes (accessed centrally via the Canadian Institute for Health Information). We identified ICD-10 diagnostic codes in CCEDRRN participants. We defined lab-confirmed SARS-CoV-2 based on at least one positive PCR in the 0–14 days before the ED presentation and/or during hospitalization (in those admitted from ED). We performed separate analyses for CCEDRRN participants discharged from ED and those hospitalized from the ED. Additional analyses were stratified by province, sex, age, and (for hospitalized patients) timing of the first PCR test. The sensitivity of ICD-10 code U07.1 for a positive SARS-CoV-2 test was 93.6% (95% CI 93.0–94.1%) in those hospitalized from ED and 83.0% (95% CI 82.1–83.9%) in those discharged from the ED. Sensitivity was similar across provinces and demographics, but in each stratified analysis, values were higher in those hospitalized versus those discharged from ED. The ICD-10 diagnostic code for U07.1 within administrative data identified most lab-confirmed SARS-CoV-2 within persons hospitalized from ED, although a significant number of cases discharged from ED were missed. This should be considered when using administrative data for research and public health planning.


Study sample
CCEDRRN is a research registry of consecutive individuals with suspected/confirmed SARS-CoV-2 infection presenting to 51 urban and rural emergency departments (EDs) in eight Canadian provinces (British Columbia, Alberta, Manitoba, Saskatchewan, Quebec, Ontario, Nova Scotia, New Brunswick) from March 1, 2020, to August 2021 [1-3]. The registry obtained ethics approval to enroll participants with a waiver of informed consent, allowing us to capture a complete sample. Participants with suspected or confirmed COVID-19 presenting to one of the participating EDs were enrolled using pre-defined clinical criteria (more details published elsewhere) [1-3]. In summary, patients were included in two distinct periods (depending on the province and based on the availability of COVID-19 testing; see Supplemental Material, Table A). In the first period (covering the early phase of the pandemic up to April-May 2020), the criteria were fever and one respiratory symptom (including flu-like illness, shortness of breath or cough), or presenting to the ED and being tested for SARS-CoV-2 in the ED. The second period started on the date each province expanded testing criteria, allowing clinicians to test patients based on clinical suspicion or policy. Inclusion criteria in this period encompassed: (1) patients tested for SARS-CoV-2 in the ED or within 24 h of arrival and (2) patients presenting to the ED within 14 days of a positive SARS-CoV-2 test and presenting with clinical symptoms of COVID-19. In this period, elective, non-ED admissions were excluded. We excluded patients without available PCR tests for this study.
Standardized data abstracted from medical records include demographics, symptoms, SARS-CoV-2 risk factors (e.g., travel, work, contacts), selected comorbidities, procedures, medications, SARS-CoV-2 RNA reverse transcription-polymerase chain reaction (PCR) testing, other lab results, and hospitalization details for those whose ED presentation resulted in admission. Of all ED visits in the CCEDRRN dataset, 95% have at least one PCR test available (an inclusion criterion for our current analyses), including negative, positive, and indeterminate/unknown results.
CCEDRRN has research ethics board (REB) approval to link registry data (via each person's unique provincial health number) with electronic administrative health databases containing ICD-10 diagnostic codes (including U07.1) assigned during ED visits and, for those admitted from the ED, during hospitalizations (discharge data here include deaths during the hospital stay). Administrative data were accessed via the Canadian Institute for Health Information (CIHI) [4]. CIHI is an agency created by Canada's federal, territorial, and provincial governments (except Quebec, which contributes limited data and thus is not included in our analyses, except for the overall description of the CCEDRRN registry) [4]. CIHI's health system databases include the Discharge Abstract Database (DAD), which captures administrative health information about hospitalizations, and the National Ambulatory Care Reporting System (NACRS), which captures emergency and ambulatory care visits. During our study period, facilities in British Columbia did not provide NACRS ICD code data; therefore, this province was not included in the analyses of individuals discharged from the ED.
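The linkage step can be illustrated with a minimal Python sketch. It is not the registry's actual linkage procedure: the field names (`health_number`, `visit_date`, `icd10_codes`) are hypothetical, and matching on the visit date in addition to the health number is an assumption made here so that repeat visits by the same person stay distinct.

```python
from typing import Dict, Iterable, List, Set, Tuple


def flag_u071(registry_visits: Iterable[dict], admin_records: Iterable[dict]) -> List[dict]:
    """Attach a has_u071 flag to each registry visit that links to an administrative record.

    Assumption: records link on (health_number, visit_date). Visits with no
    matching administrative record are dropped from the linked sample.
    """
    codes_by_key: Dict[Tuple[str, str], Set[str]] = {}
    for rec in admin_records:
        key = (rec["health_number"], rec["visit_date"])
        codes_by_key.setdefault(key, set()).update(rec["icd10_codes"])

    linked = []
    for visit in registry_visits:
        key = (visit["health_number"], visit["visit_date"])
        if key in codes_by_key:
            linked.append({**visit, "has_u071": "U07.1" in codes_by_key[key]})
    return linked
```

Unlinked visits are excluded rather than counted as code-negative, mirroring the restriction of the analysis to the linked CCEDRRN-CIHI sample.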
The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), along with 95% confidence intervals (CI), were estimated for the CCEDRRN-CIHI sample. We performed separate analyses for ED visits that resulted in discharge and those resulting in hospitalization from the ED.
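Each of these measures is a binomial proportion, so a 95% CI can be attached to it with any standard interval for proportions. The text does not state which interval method was used; the sketch below assumes the Wilson score interval purely for illustration, and the function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.959964 gives ~95%).

    Assumption: the Wilson method is used here for illustration only; the
    study does not specify its confidence interval method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

For sensitivity, `successes` would be the number of true positives and `total` the number of PCR-positive visits; for PPV, `total` would be the number of visits carrying code U07.1.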

ICD-10 code U07.1
In the CCEDRRN-CIHI sample, we identified all administrative data U07.1 diagnostic codes from ED visits (and from hospitalizations from the ED, when these occurred). We then assessed the performance of ICD-10 code U07.1 (laboratory-confirmed SARS-CoV-2) against our reference standard: PCR test results during our study time interval, i.e., 0-14 days before (or during) the ED visit or during the hospital stay for those admitted from the ED. We limited our analyses to ED visits with at least one PCR test within that interval (except for the first few weeks of the pandemic, CCEDRRN enrollment required a PCR test; thus, about 95% of all CCEDRRN patients have at least one SARS-CoV-2 PCR test).
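The reference-standard interval can be expressed as a simple date check. This is a sketch under stated assumptions: dates are handled at day granularity, both ends of the window are inclusive, and the function names and the result label "positive" are hypothetical rather than taken from the registry.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple


def in_reference_window(test_date: date, ed_visit: date,
                        hospital_discharge: Optional[date] = None,
                        lookback_days: int = 14) -> bool:
    """True if a PCR test falls 0-14 days before (or on) the ED visit date,
    or, for admitted patients, any time up to hospital discharge."""
    window_start = ed_visit - timedelta(days=lookback_days)
    window_end = hospital_discharge if hospital_discharge is not None else ed_visit
    return window_start <= test_date <= window_end


def lab_confirmed(tests: Iterable[Tuple[date, str]], ed_visit: date,
                  hospital_discharge: Optional[date] = None) -> bool:
    """Lab-confirmed SARS-CoV-2: at least one positive PCR inside the window."""
    return any(
        result == "positive" and in_reference_window(d, ed_visit, hospital_discharge)
        for d, result in tests
    )
```

Passing `hospital_discharge=None` models a visit that ended in discharge from the ED, so only tests up to the ED visit date count.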
To analyze the operating characteristics of ICD-10 code U07.1 in administrative ED visit and/or hospital discharge diagnostic codes, we defined true positives (TP) as CCEDRRN-CIHI ED visits whose electronic administrative health data included ICD-10 diagnostic code U07.1 and that had at least one positive PCR test at any time from 0 to 14 days before (or during) the ED visit or during the hospital stay in those admitted from the ED. Individuals with multiple tests within the period were considered a true case if at least one test was positive. False positives (FP) were ED visits that had an administrative data ICD-10 diagnostic code U07.1 and no positive SARS-CoV-2 PCR test result (but at least one test with a negative or indeterminate/unknown result) documented in CCEDRRN for the 0-14 days before (or during) the ED visit and/or during hospitalization in those admitted from the ED. False negatives (FN) were CCEDRRN-CIHI ED visits without an administrative data ICD-10 diagnostic code U07.1 but with at least one positive PCR test documented in CCEDRRN within the same interval. True negatives (TN) were ED visits without ICD-10 code U07.1 and with no positive PCR test (at least one test was done and recorded as negative or indeterminate).
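These four definitions map directly onto a two-by-two classification. The sketch below is an illustration rather than the study's analysis code: it assumes each visit is summarized as a U07.1 flag plus the list of PCR results already restricted to the reference interval, and all names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def classify_visit(has_u071: bool, pcr_results: Sequence[str]) -> str:
    """Return TP, FP, FN or TN for one ED visit.

    pcr_results holds the results ("positive", "negative", "indeterminate")
    of tests inside the reference interval; one positive makes a true case.
    """
    if not pcr_results:
        raise ValueError("visits without a PCR test in the interval are excluded")
    positive = any(r == "positive" for r in pcr_results)
    if has_u071:
        return "TP" if positive else "FP"
    return "FN" if positive else "TN"


def operating_characteristics(visits: Iterable[Tuple[bool, Sequence[str]]]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from (has_u071, pcr_results) pairs."""
    counts = Counter(classify_visit(code, results) for code, results in visits)
    tp, fp, fn, tn = (counts[k] for k in ("TP", "FP", "FN", "TN"))

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # coded among PCR-positive visits
        "specificity": ratio(tn, tn + fp),  # uncoded among PCR-negative visits
        "ppv": ratio(tp, tp + fp),          # PCR-positive among coded visits
        "npv": ratio(tn, tn + fn),          # PCR-negative among uncoded visits
    }
```

Running the function separately on visits discharged from the ED and on visits admitted from the ED reproduces the structure of the two main analyses.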

Stratified and sensitivity analyses
As noted earlier, in our main analyses, we performed separate analyses for CCEDRRN participants who were discharged from the ED and those hospitalized from the ED. Additional stratified analyses were carried out to investigate potential differences in operating characteristics across provinces, across sex and age groups (< 50 years, 50-75 years, and > 75 years old), calendar periods, and selected comorbidities (asthma, pulmonary fibrosis, and chronic lung disease). For hospitalized patients, we also stratified by timing of the first PCR test.
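A stratified analysis amounts to computing the same measure within each subgroup. The following sketch shows this for sensitivity by age group; the record fields (`age`, `pcr_positive`, `has_u071`) are hypothetical, and the handling of ages exactly 50 or 75 follows the cut-points as written above.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable


def age_group(age: float) -> str:
    """Map age in years to the three strata: < 50, 50-75, > 75."""
    if age < 50:
        return "<50"
    if age <= 75:
        return "50-75"
    return ">75"


def sensitivity_by_stratum(records: Iterable[dict],
                           key: Callable[[dict], Hashable]) -> Dict[Hashable, float]:
    """Sensitivity of code U07.1 within each stratum defined by key(record)."""
    tally = defaultdict(lambda: [0, 0])  # stratum -> [true positives, false negatives]
    for rec in records:
        if not rec["pcr_positive"]:
            continue  # sensitivity only uses PCR-positive visits
        tally[key(rec)][0 if rec["has_u071"] else 1] += 1
    return {stratum: tp / (tp + fn) for stratum, (tp, fn) in tally.items()}
```

The same function can stratify by province or sex by passing a different `key`, for example `lambda r: r["province"]`.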

CCEDRRN characteristics
The original CCEDRRN registry comprised 138,676 ED visits involving 112,995 participants enrolled between Mar. 1, 2020, and Aug. 27, 2021. Across all ED visits, the participant distribution was nearly equal between males and females, with a median age of 58 years (interquartile range 39-74). Notably, three-quarters of participants came from the most populous provinces: Quebec, British Columbia and Ontario. Further details regarding these ED visits are provided in Table 1.
We studied 77,000 ED visits from the original CCEDRRN registry that had at least one PCR test (done during outpatient clinics, ED visits, or hospitalization) and were linked to electronic administrative health data. The linkage included 31,430 home-discharged ED visits and 45,570 hospitalizations from the ED (see Fig. 1).
Among participants who were admitted to hospital from the ED, the sensitivity of diagnostic code U07.1 (from the linked administrative data) to detect lab-confirmed SARS-CoV-2 was 93.6% (95% CI 93.0-94.1%). The sensitivity of code U07.1 for lab-confirmed SARS-CoV-2 in CCEDRRN participants discharged from the ED was 83.0% (95% CI 82.1-83.9%). The remaining operating characteristics for the main analyses are detailed in Table 2. Specificity, PPV, and NPV estimates were always better in individuals admitted from the ED. For example, in those hospitalized from the ED, the PPV of administrative data ICD diagnostic code U07.1 was 98.6% (95% CI 98.4-98.9%) versus 90.1% (95% CI 89.4-90.8%) for patients discharged from the ED. The sensitivity for those discharged from the ED was particularly low in the over-75 age group.
Tables 3 and 4 present stratified analyses; sensitivity was similar across provinces and demographics, but in each stratified analysis, values were higher in hospitalizations versus those discharged from the ED. In the hospitalized sample, sensitivity was highest if the first PCR test occurred 0-14 days before ED presentation.

Discussion
We found high sensitivity, specificity and PPV for ICD-10 diagnostic code U07.1 when positive PCR testing was considered the reference standard. ICD-10 code U07.1 had higher sensitivity, specificity and PPV in those hospitalized from the ED versus those discharged from the ED. This may be as expected, given that hospitalized patients would be sicker (often having a higher viral load, thus more likely to have a positive PCR test) and have more opportunities to have repeat PCR tests [5]. Given our prior finding that the sensitivity of PCR testing is very high in the ED and does not drop during the first few days of admission [5], the poorer performance of code U07.1 in patients discharged from the ED likely reflects differences in how ICD-10 codes are assigned in ED versus hospitalization data. Specifically, a physician ordering a PCR in the ED may not have the result of that test at the time when the individual is discharged from the ED. This would presumably increase the chances of emergency physicians charting other diagnoses in the medical records, resulting in other ICD- [...] ICD-10 code U07.1 in identifying SARS-CoV-2 infections, with PCR as the reference standard, found results very similar to ours [6].

Table 3. Operating characteristics of ICD-10 code U07.1 (lab-confirmed SARS-CoV-2 infection) in administrative data, with PCR testing as the reference standard, stratified by province.
1 For hospitalized patients, represents British Columbia (BC), Nova Scotia (NS), and Saskatchewan. For patients discharged from the ED, necessary information on ICD codes was not available for BC, thus analyses represent NS and Saskatchewan.

CCEDRRN collected data very early in the pandemic when universal testing was not available to the community. Most people who were tested were tested in the hospital or the ED. Of all patients in the reference dataset, only 2.3% had a self-reported positive SARS-CoV-2 test from the community within the 14 days of the ED visit and, of these, 12% were reconfirmed with a positive test in the ED; however, 90% were discharged with confirmed COVID-19 as the diagnosis. Thus, the vast majority of people were only tested in hospital or the ED. There is minimal overlap between the reference and the validation data set.
Four studies from the United States, again using positive PCR tests as the gold standard to validate ICD-10 discharge diagnostic codes, found results similar to ours [7-11]. One study from the Mass General Brigham system (which includes Massachusetts General Hospital, Brigham and Women's Hospital, and other allied hospitals across Massachusetts) found a much lower sensitivity than in other studies [12]. Their estimates varied considerably over the study period, with the highest estimate (from May 2020) being 60.9% (57.3-64.4%) [12]. They attributed the lower sensitivity to delays in assigning discharge diagnostic codes, changes to PCR testing criteria and other factors [12].
ICD-10 code U07.1 had only moderate agreement with PCR test positivity in those discharged from the ED.This is a potential concern, as many SARS-CoV-2-infected patients are discharged from the ED, representing a significant disease burden in the community.Our sensitivity estimates tended to be particularly low in older individuals (75+) discharged from the ED.This group is vulnerable to unfavourable outcomes after SARS-CoV-2 infection, and missing this group in public health surveillance or population-based research may considerably affect estimates.Health policy decisions relating to pandemic preparation, including resource planning, may not be optimal if based exclusively on ICD-10 code U07.1 administrative data, at least for patients discharged from the ED.This knowledge is important if we want a complete understanding of what ED and community resources may be required to manage future infectious disease crises (potentially including influenza surges and/or new health threats).

Conclusion
In conclusion, ICD-10 diagnostic codes for U07.1 within administrative health data identified most lab-confirmed SARS-CoV-2 infections in patients admitted to the hospital from the ED. Administrative health data diagnostic codes were less sensitive for identifying lab-confirmed SARS-CoV-2 in patients discharged from the ED. This limitation is important to acknowledge if ICD code U07.1 is used for SARS-CoV-2 case detection for research and public health purposes.
The McGill University Research Ethics Board approved this study. The research ethics boards of participating institutions (see Supplement) reviewed and approved the study protocol with a waiver of informed consent for patient enrollment. All research was performed in accordance with relevant guidelines and regulations.

Fig. 1. Flowchart of records selection. *Participants from British Columbia and Quebec could not be linked with administrative data as they do not provide the relevant data to the Canadian Institute for Health Information. Also, participants from British Columbia were included only in the hospitalization-based analysis, as this province does not provide data on ICD-10 codes for ED visits.

Table 2. Operating characteristics of ICD-10 code U07.1 (lab-confirmed SARS-CoV-2 infection) from emergency department (ED) and/or hospital diagnostic codes, with PCR testing as the reference standard. For hospitalized patients, results represent British Columbia (BC), Nova Scotia (NS), and Saskatchewan. For patients discharged from the ED, necessary data on diagnostic codes were unavailable for BC, thus analyses represent NS and Saskatchewan.

Table 4. Operating characteristics of ICD-10 code U07.1 (lab-confirmed SARS-CoV-2 infection) in administrative data, with PCR testing as the reference standard, stratified by sex and age.